CentOS 7 : Install TensorFlow (with GPU) : Server World

Install TensorFlow (with GPU Support)

2018/01/29

	Install TensorFlow, the machine learning library from Google. TensorFlow provides APIs for several languages, including Python, C, Java, and Go. This example uses Python 2.7 (the supported versions are Python 2.7, or Python 3.3 and later). It also installs the officially provided binary package, which comes in a GPU-enabled build and a CPU-only build. This example installs the GPU-enabled build.
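The version requirement stated above can be checked before installing anything. The following is a minimal sketch: the helper name is_supported_python is hypothetical, and the rule it encodes is only the one given in this article (Python 2.7, or Python 3.3 and later).

```python
import sys


def is_supported_python(version_info):
    """Return True for Python 2.7, or Python 3.3 and later (rule as stated in this article)."""
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor == 7
    return (major, minor) >= (3, 3)


# Report on the interpreter that runs this script.
print("Python %d.%d supported: %s" % (
    sys.version_info[0], sys.version_info[1], is_supported_python(sys.version_info)))
```

Run it with the same interpreter that pip will install into, for example "python check_version.py" (the file name is only an example).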
[1]	The GPU-enabled build requires CUDA 7.0 or later. This example assumes an environment with CUDA 9.0 already installed.
[2]	cuDNN (CUDA Deep Neural Network library) v3 or later is also required. cuDNN is provided by NVIDIA; download it from the site below (downloading requires a registered developer account). ⇒ https://developer.nvidia.com/rdp/cudnn-download
[3]	Install the cuDNN archive downloaded above.

[root@dlp ~]# tar zxvf cudnn-9.0-linux-x64-v7.tgz
[root@dlp ~]# cp ./cuda/include/cudnn.h /usr/local/cuda-9.0/include/
[root@dlp ~]# cp -a ./cuda/lib64/libcudnn* /usr/local/cuda-9.0/lib64/
[root@dlp ~]# ldconfig
[root@dlp ~]# echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-9.0/extras/CUPTI/lib64' >> /etc/profile.d/cuda90.sh
[root@dlp ~]# source /etc/profile.d/cuda90.sh
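After copying the files, it may help to confirm that they landed where the commands above put them. The following is a rough sketch rather than an official check: the helper names check_cudnn and cupti_on_library_path are hypothetical, and the paths are simply the ones used in this article.

```python
import glob
import os


def check_cudnn(cuda_home):
    """Return a list of problems found under cuda_home (an empty list means the copy looks fine)."""
    problems = []
    header = os.path.join(cuda_home, "include", "cudnn.h")
    if not os.path.isfile(header):
        problems.append("missing header: " + header)
    lib_dir = os.path.join(cuda_home, "lib64")
    if not glob.glob(os.path.join(lib_dir, "libcudnn*")):
        problems.append("no libcudnn* files in " + lib_dir)
    return problems


def cupti_on_library_path(cuda_home, ld_library_path):
    """Return True if the CUPTI directory appears in an LD_LIBRARY_PATH-style string."""
    cupti = os.path.normpath(os.path.join(cuda_home, "extras", "CUPTI", "lib64"))
    entries = [os.path.normpath(p) for p in ld_library_path.split(":") if p]
    return cupti in entries


cuda_home = "/usr/local/cuda-9.0"  # path used in this article
for problem in check_cudnn(cuda_home):
    print(problem)
print("CUPTI on LD_LIBRARY_PATH: %s" % cupti_on_library_path(
    cuda_home, os.environ.get("LD_LIBRARY_PATH", "")))
```

If the script prints no "missing" lines and reports True for CUPTI, the files and the environment variable match what this step sets up.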

[4]	Install other required packages.

# install from EPEL
[root@dlp ~]# yum --enablerepo=epel -y install python2-pip python-devel

[5]	Install TensorFlow.

# update pip first
[root@dlp ~]# easy_install -U pip
Searching for pip
Reading https://pypi.python.org/simple/pip/
Best match: pip 9.0.1

[root@dlp ~]# pip install --upgrade tensorflow-gpu

Collecting tensorflow-gpu
  Downloading tensorflow_gpu-1.5.0-cp27-cp27mu-manylinux1_x86_64.whl (201.9MB)

.....
.....

Successfully installed absl-py-0.1.9 backports.weakref-1.0.post1 bleach-1.5.0 
enum34-1.1.6 funcsigs-1.0.2 futures-3.2.0 html5lib-0.9999999 markdown-2.6.11 
mock-2.0.0 numpy-1.14.0 pbr-3.1.1 protobuf-3.5.1 setuptools-38.4.0 six-1.11.0 
tensorflow-gpu-1.5.0 tensorflow-tensorboard-1.5.0 werkzeug-0.14.1 wheel-0.30.0
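The wheel file name in the pip output above encodes which interpreter the binary targets. As a rough sketch (the helper parse_wheel_filename is hypothetical and not part of pip), the name can be split according to the PEP 427 layout: name, version, optional build tag, Python tag, ABI tag, platform tag. Here cp27 means CPython 2.7, cp27mu is the CPython 2.7 ABI built with wide (UCS-4) Unicode, and manylinux1_x86_64 is the portable 64-bit Linux platform tag.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


# File name taken from the pip output in this article.
info = parse_wheel_filename("tensorflow_gpu-1.5.0-cp27-cp27mu-manylinux1_x86_64.whl")
print("python tag: %s, abi tag: %s, platform tag: %s" % (
    info["python_tag"], info["abi_tag"], info["platform_tag"]))
```

If pip picks a wheel whose tags do not match the interpreter you intend to use, check which python and pip are first on the PATH.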

[6]	Verify the installation as a regular user. The [Your CPU supports instructions ***] messages are not a problem; they only mean that the TensorFlow binary was compiled without the CPU features listed in the message.

[cent@dlp ~]$ vi hello_tensorflow.py
# build a graph holding one constant string, then evaluate it in a session (TensorFlow 1.x API)
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

[cent@dlp ~]$ python ./hello_tensorflow.py

2018-01-28 11:27:55.110172: I tensorflow/core/platform/cpu_feature_guard.cc:137] 
                            Your CPU supports instructions that this TensorFlow binary 
                            was not compiled to use: SSE4.1 SSE4.2
2018-01-28 11:27:56.522539: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105]
                            Found device 0 with properties:
                            name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.8475
                            pciBusID: 0000:03:00.0
                            totalMemory: 5.93GiB freeMemory: 5.86GiB
2018-01-28 11:27:56.522640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] 
                            Creating TensorFlow device (/device:GPU:0) -> (device: 0, 
                            name: GeForce GTX 1060 6GB, pci bus id: 0000:03:00.0, compute capability: 6.1)
Hello, TensorFlow!
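The informational lines in the output above come from TensorFlow's native logging layer and can be filtered with the TF_CPP_MIN_LOG_LEVEL environment variable, which has to be set before TensorFlow is imported. The following is a minimal sketch, assuming the script from this step; the helper names quiet_env and run_quietly are hypothetical. Note that level "1" hides all INFO lines, including the "Found device" messages that confirm the GPU was detected, so it is best used only after the installation has been verified.

```python
import os
import subprocess
import sys


def quiet_env(base_env, level="1"):
    """Return a copy of base_env with TF_CPP_MIN_LOG_LEVEL set.

    "0" shows everything, "1" hides INFO, "2" also hides WARNING, "3" also hides ERROR.
    """
    env = dict(base_env)
    env["TF_CPP_MIN_LOG_LEVEL"] = level
    return env


def run_quietly(script_path):
    """Run a Python script in a child process with INFO-level TensorFlow C++ logs filtered out."""
    return subprocess.call([sys.executable, script_path], env=quiet_env(os.environ))


# Example (not executed here): run_quietly("./hello_tensorflow.py")
```

Setting the variable in a child process keeps the current shell environment unchanged, so later runs still show the full log.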

[7]	Try TensorFlow by running an officially provided basic model.

[cent@dlp ~]$ mkdir tensorflow
[cent@dlp ~]$ cd tensorflow
[cent@dlp tensorflow]$ git clone https://github.com/tensorflow/models.git
[cent@dlp tensorflow]$ cd models/official/mnist
[cent@dlp mnist]$ python mnist.py

INFO:tensorflow:Using default config.

.....
.....

INFO:tensorflow:loss = 5.4853288e-05, step = 47901 (0.749 sec)
INFO:tensorflow:Saving checkpoints for 48000 into /tmp/mnist_model/model.ckpt.
INFO:tensorflow:Loss for final step: 6.582842e-05.
INFO:tensorflow:Starting evaluation at 2018-01-28-04:22:02
2018-01-28 13:22:03.039551: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] 
    Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1060 6GB, 
    pci bus id: 0000:03:00.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from /tmp/mnist_model/model.ckpt-48000
INFO:tensorflow:Finished evaluation at 2018-01-28-04:22:04
INFO:tensorflow:Saving dict for global step 48000: accuracy = 0.0987, global_step = 48000, loss = 0.038478132

Evaluation results:
        {'loss': 0.038478132, 'global_step': 48000, 'accuracy': 0.0987}

[8]	Run nvidia-smi while training is in progress to check the state of the graphics card and confirm that the GPU and GPU memory are being used.

[root@dlp ~]# nvidia-smi

Tue Jan 28 13:21:54 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.12                 Driver Version: 390.12                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:03:00.0 Off |                  N/A |
| 38%   68C    P2    87W / 120W |   5941MiB /  6077MiB |     70%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     22493      C   python                                      5931MiB |
+-----------------------------------------------------------------------------+
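For repeated checks during training, the same figures can be read in machine-friendly form with the --query-gpu and --format options of nvidia-smi. The following is a sketch under stated assumptions: the helper names parse_gpu_csv and query_gpus are hypothetical, and the sample line is hand-built from the values in the table above, not captured output.

```python
import subprocess

# Fields requested from nvidia-smi, in the order they appear in each CSV line.
QUERY_FIELDS = ["name", "memory.used", "memory.total", "utilization.gpu"]


def _to_int(value):
    """Convert a numeric field to int; return None for values such as "[Not Supported]"."""
    try:
        return int(value)
    except ValueError:
        return None


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts, one per GPU."""
    gpus = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) != len(QUERY_FIELDS):
            continue  # skip lines that do not match the expected layout
        name, used, total, util = values
        gpus.append({
            "name": name,
            "memory_used_mib": _to_int(used),
            "memory_total_mib": _to_int(total),
            "utilization_pct": _to_int(util),
        })
    return gpus


def query_gpus():
    """Call nvidia-smi and return the parsed per-GPU readings."""
    cmd = ["nvidia-smi",
           "--query-gpu=" + ",".join(QUERY_FIELDS),
           "--format=csv,noheader,nounits"]
    return parse_gpu_csv(subprocess.check_output(cmd).decode("utf-8"))


# Hand-built sample line mirroring the table above (illustrative only).
sample = "GeForce GTX 1060 6GB, 5941, 6077, 70"
print(parse_gpu_csv(sample))
```

On a machine with the NVIDIA driver installed, calling query_gpus() in a loop gives a simple way to log memory use and utilization while mnist.py is running.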